







Benchmarking Template
=================================================================================


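This is an example notebook that can be used to create a benchmarking notebook for a task of your choice. Evaluation is really hard, so we greatly welcome any contributions that make it easier for people to experiment.

It is highly recommended that you run any evaluation or benchmark with tracing enabled. See [here](https://langchain.readthedocs.io/en/latest/tracing) for an explanation of what tracing is and how to set it up.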


 







```python
# Comment this out if you are NOT using tracing
import os
os.environ["LANGCHAIN_HANDLER"] = "langchain"

```







Loading the data
-----------------------------------------------------------------------



First, let's load the data.
 







```python
# This notebook should show how to load the dataset from LangChainDatasets on Hugging Face

# Please upload your dataset to https://huggingface.co/LangChainDatasets

# The value passed into `load_dataset` should NOT have the `LangChainDatasets/` prefix
from langchain.evaluation.loading import load_dataset
dataset = load_dataset("TODO")

```
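For orientation, here is a minimal sketch of the shape such a dataset is assumed to take once loaded: a list of dictionaries, one per datapoint. The field names `question` and `answer` and the two records are hypothetical placeholders, not taken from any real LangChainDatasets dataset.

```python
# Hypothetical stand-in for the value returned by `load_dataset`.
# Assumption: the dataset is a list of dicts, one dict per datapoint.
example_dataset = [
    {"question": "What is 2 + 2?", "answer": "4"},
    {"question": "What is the capital of France?", "answer": "Paris"},
]


def summarize(dataset):
    """Return the number of datapoints and the sorted field names they all share."""
    shared_keys = set(dataset[0])
    for row in dataset[1:]:
        shared_keys &= set(row)
    return len(dataset), sorted(shared_keys)


size, fields = summarize(example_dataset)
print(f"{size} datapoints with fields {fields}")
```

Checking the shared field names early makes it easier to see which keys a chain's prompt can rely on.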








Setting up a chain
---------------------------------------------------------------------------



This section should contain an example of setting up a chain that can be run on this dataset.
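Until a real chain is filled in, a stand-in can help exercise the rest of the notebook end to end. The sketch below is not a LangChain class: `StubChain` is a hypothetical plain-Python object that only imitates calling a chain with a dict and applying it over a list, and the `question` and `text` keys are assumed names.

```python
class StubChain:
    """Hypothetical stand-in for a chain; it makes no LLM calls."""

    def __call__(self, inputs):
        # Return the inputs together with a canned "prediction".
        return {**inputs, "text": f"stub answer to: {inputs['question']}"}

    def apply(self, input_list):
        # Run the stub on each datapoint, preserving order.
        return [self(inputs) for inputs in input_list]


chain = StubChain()
result = chain({"question": "What is 2 + 2?"})
print(result["text"])
```

Replace `StubChain` with the actual chain for your task once the prompt and model are chosen.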



Make a prediction
-------------------------------------------------------------------------


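First, we can make predictions one datapoint at a time. Working at this level of granularity lets us explore the outputs in detail, and it is also much cheaper than running over many datapoints.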






```python
# Example of running the chain on a single datapoint (`dataset[0]`) goes here

```
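As a sketch of what this cell might contain, the snippet below runs a chain on one datapoint and prints the prediction next to the reference answer. `fake_chain`, the sample record, and the `question`, `answer`, and `text` keys are all hypothetical; substitute your real chain and `dataset[0]`.

```python
def fake_chain(inputs):
    """Hypothetical stand-in for a chain call; returns the inputs plus a `text` output."""
    return {**inputs, "text": inputs["question"].upper()}


# Hypothetical single datapoint standing in for `dataset[0]`.
datapoint = {"question": "what is 2 + 2?", "answer": "4"}

output = fake_chain(datapoint)
print("input:     ", datapoint["question"])
print("prediction:", output["text"])
print("reference: ", datapoint["answer"])
```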








Make many predictions
---------------------------------------------------------------------------------



Now we can make predictions over many datapoints.
 







```python
# Example of running the chain on many predictions goes here

# Sometimes it's as simple as `chain.apply(dataset)`

# Other times you may want to write a for loop to catch errors

```
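The error-catching loop mentioned in the cell above could look roughly like the following. This is a sketch under stated assumptions: `flaky_chain` and the three-record dataset are hypothetical stand-ins, and recording `None` for a failed datapoint is just one possible convention for keeping predictions aligned with the dataset.

```python
def flaky_chain(inputs):
    """Hypothetical stand-in for a chain that fails on some inputs."""
    if not inputs["question"]:
        raise ValueError("empty question")
    return {"text": f"answer to: {inputs['question']}"}


# Hypothetical dataset; the second datapoint triggers an error.
dataset = [
    {"question": "first"},
    {"question": ""},
    {"question": "third"},
]

predictions = []
errors = []
for index, datapoint in enumerate(dataset):
    try:
        predictions.append(flaky_chain(datapoint))
    except Exception as exc:  # keep going so one failure does not end the run
        predictions.append({"text": None})  # placeholder keeps indices aligned
        errors.append((index, repr(exc)))

print(f"{len(predictions) - len(errors)} succeeded, {len(errors)} failed")
```

Keeping the failed indices in `errors` makes it possible to retry only those datapoints later instead of rerunning the whole dataset.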








Evaluate performance
-------------------------------------------------------------------------------



Any guide to evaluating performance in a more systematic way goes here.
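As one possible starting point, the sketch below computes exact-match accuracy between predictions and reference answers. It is an illustration rather than a recommended metric for every task: the `answer` and `text` keys, the sample records, and the choice to trim and lowercase before comparing are all assumptions.

```python
def exact_match_accuracy(examples, predictions):
    """Fraction of predictions equal to the reference after trimming and lowercasing."""
    if not examples:
        return 0.0
    correct = 0
    for example, prediction in zip(examples, predictions):
        expected = example["answer"].strip().lower()
        # Treat a missing prediction (None) as an empty string.
        actual = (prediction["text"] or "").strip().lower()
        correct += expected == actual
    return correct / len(examples)


# Hypothetical references and predictions, aligned by position.
examples = [{"answer": "4"}, {"answer": "Paris"}, {"answer": "blue"}]
predictions = [{"text": " 4 "}, {"text": "paris"}, {"text": "red"}]

accuracy = exact_match_accuracy(examples, predictions)
print(f"accuracy: {accuracy:.2f}")
```

Exact match is strict; tasks with free-form answers usually need a more forgiving comparison.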




